Chapter 15 CLUSTERING METHODS
Abstract
This chapter presents a tutorial overview of the main clustering methods used in Data Mining. The goal is to provide a self-contained review of the concepts and the mathematics underlying clustering techniques. The chapter begins by providing measures and criteria that are used for determining whether two objects are similar or dissimilar. Then the clustering methods are presented, divided into: hierarchical, partitioning, density-based, model-based, grid-based, and soft-computing methods. Following the methods, the challenges of performing clustering in large data sets are discussed. Finally, the chapter presents how to determine the number of clusters.
Similar resources
Data Clustering and Graph-Based Image Matching Methods
This thesis describes our novel methods for data clustering, graph characterizing and image matching. In Chapter 3, our main contribution is the M1NN agglomerative clustering method with a new parallel merging algorithm. A cluster characterizing quantity is derived from the path-based dissimilarity measure. In Chapter 4, our main contribution is the modified log-likelihood model for quantitativ...
Improving text clustering for functional analysis of genes
Chapter 1 Literature Review and Requirements Analysis: 1.1 Functional analysis of microarray data; 1.2 Gene Ontology-based functional analysis; 1.3 Literature-based functional analysis (1.3.1 Assuming similar expressions imply same functional pathway; 1.3.2 Not assuming similar expressions imply same functional pathway); 1.4 Hybrid systems; 1.5 Requirements analysis. Chapter 2 Strat...
Web Data Clustering
This chapter provides a survey of some clustering methods relevant to clustering Web elements for better information access. We start with classical methods of cluster analysis that seem to be relevant in approaching the clustering of Web data. Graph clustering is also described since its methods contribute significantly to clustering Web data. The use of artificial neural networks for cluster...
Data Clustering: from Documents to the Web
The chapter provides a survey of some clustering methods relevant to clustering document collections and, in consequence, Web data. We start with classical methods of cluster analysis which seem to be relevant in approaching the clustering of Web data. Graph clustering is also described since its methods contribute significantly to clustering Web data. A use of artificial neural networks for c...
Detecting stable clusters using principal component analysis.
Clustering is one of the most commonly used tools in the analysis of gene expression data (1, 2). Its use in grouping genes is based on the premise that co-expression is a result of co-regulation. It is thus a preliminary step in extracting gene networks and inferring gene function (3, 4). Clustering of experiments can be used to discover novel phenotypic aspects of cells and tissues (3,...